
    Using Text Similarity to Detect Social Interactions not Captured by Formal Reply Mechanisms

    In modeling social interaction online, it is important to understand when people are reacting to each other. Many systems have explicit indicators of replies, such as threading in discussion forums or replies and retweets in Twitter. However, these explicit indicators likely capture only part of people's reactions to each other, so computational social science approaches that use them to infer relationships or influence are likely to miss the mark. This paper explores the problem of detecting non-explicit responses, presenting a new approach that uses tf-idf similarity between a user's own tweets and recent tweets by people they follow. Based on a month's worth of posting data from 449 ego networks in Twitter, this method indicates that at least 11% of reactions are likely not captured by the explicit reply and retweet mechanisms. Further, these uncaptured reactions are not evenly distributed between users: some users, who create replies and retweets without using the official interface mechanisms, are much more responsive to followees than they appear. This suggests that detecting non-explicit responses is an important consideration in mitigating biases and building more accurate models when using these markers to study social interaction and information diffusion. Comment: A final version of this work was published in the 2015 IEEE 11th International Conference on e-Science (e-Science
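    The tf-idf comparison described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the whitespace tokenizer, the smoothed idf formula, the 0.5 threshold, and the function names (`tfidf_vectors`, `likely_reactions`) are all assumptions made for illustration, since the abstract does not state the preprocessing, weighting, time window, or cutoff used.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the abstract does not
    # describe the paper's actual preprocessing.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        # Assumption: smoothed idf, ln((1 + N) / (1 + df)) + 1.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def likely_reactions(own_tweet, followee_tweets, threshold=0.5):
    """Return (index, score) for followee tweets similar to own_tweet.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    vectors = tfidf_vectors([own_tweet] + followee_tweets)
    own = vectors[0]
    scores = [(i, cosine(own, v)) for i, v in enumerate(vectors[1:])]
    return [(i, s) for i, s in scores if s >= threshold]
```

    In use, a user's tweet would be compared against tweets their followees posted shortly beforehand, and any pair scoring above the cutoff would be treated as a candidate non-explicit response.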

    Automatic Face Recognition System Based on Local Fourier-Bessel Features

    We present an automatic face verification system inspired by known properties of biological systems. In the proposed algorithm the whole image is converted from the spatial to the polar frequency domain by a Fourier-Bessel Transform (FBT). This whole-image (global) analysis is compared to the case where only face image regions are considered (local analysis). The resulting representations are embedded in a dissimilarity space, where each image is represented by its distance to all the other images, and a Pseudo-Fisher discriminator is built. Verification test results on the FERET database showed that the local-based algorithm outperforms the global-FBT version. The local-FBT algorithm performed on par with state-of-the-art methods under different testing conditions, indicating that the proposed system is highly robust to expression, age, and illumination variations. We also evaluated the performance of the proposed system under strong occlusion conditions and found that it is highly robust for up to 50% of face occlusion. Finally, we fully automated the verification system by implementing face and eye detection algorithms. Under this condition, the local approach was only slightly superior to the global approach. Comment: 2005, Brazilian Symposium on Computer Graphics and Image Processing, 18 (SIBGRAPI
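    The dissimilarity-space embedding mentioned in this abstract, where each image is represented by its distances to the other images, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the distance measure, so Euclidean distance is assumed, the feature vectors stand in for FBT representations that are not computed here, and `dissimilarity_embedding` is a hypothetical helper name. The Pseudo-Fisher discriminator trained on top of this embedding is not shown.

```python
import math


def euclidean(a, b):
    # Assumption: Euclidean distance; the abstract does not specify the measure.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity_embedding(features, prototypes, dist=euclidean):
    """Map each feature vector to its vector of distances to the prototypes.

    Row i of the result is the dissimilarity-space representation of
    features[i]; passing the same list as both arguments represents every
    image by its distance to all images in the set.
    """
    return [[dist(f, p) for p in prototypes] for f in features]


# Toy stand-ins for per-image feature vectors (not real FBT coefficients).
images = [[0.0, 0.0], [3.0, 4.0]]
embedding = dissimilarity_embedding(images, images)
```

    A classifier would then be trained on the rows of `embedding` rather than on the original feature vectors.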

    Notes from the Field: The Role of Datasets in Transitional Justice Research: The Case of Brazilian Truth Commission

    In 2012, Brazilian President Dilma Rousseff installed the Brazilian Truth Commission (CNV) to address gross human rights violations that occurred from 1946-1988. One of the most important sources of information available regarding this period is the files of the agencies that comprised the Brazilian intelligence system during the dictatorship. In total, there were around 12 million pages of relevant text in the National Archives. To make effective use of this trove of information, the CNV faced the challenge of applying data science tools to search for useful information within this huge dataset. As a result, a prototype of a data repository with selected documents (pdfs, images, etc.) has been created, which we summarize in this note. Computational tools for searching, organizing, and visualizing potentially important documents were developed and used to support CNV researchers. We also reflect upon the issues that complicated the CNV’s ability to gain access to reliable and comprehensive data and the limitations of analysis conducted with this type of research